ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models
Dumitru, Razvan-Gabriel, Peteleaza, Darius, Yadav, Vikas, Pan, Liangming
Large language models excel at complex tasks by breaking down problems into structured reasoning steps. However, reasoning traces often extend beyond reaching a correct answer, causing wasted computation, reduced readability, and hallucinations. To address this, we introduce a novel hyperparameter-free conciseness score used as a reward signal within a reinforcement learning framework to guide models toward generating correct and concise reasoning traces. This score is evaluated by a large language model acting as a judge, enabling dynamic, context-aware feedback beyond simple token length. Our method achieves state-of-the-art efficiency-accuracy trade-offs on the MATH dataset, reducing token usage by up to 31x on simple problems while improving accuracy by 7%, and on the hardest problems, it outperforms full reasoning by +7.5% accuracy with up to 3.6x fewer tokens. On TheoremQA, our method improves accuracy by +2.2% using 12.5x fewer tokens. We also conduct ablation studies on the judge model, reward composition, and problem difficulty, showing that our method dynamically adapts reasoning length based on problem difficulty and benefits significantly from stronger judges. The code, model weights, and datasets are open-sourced at https://github.com/RazvanDu/ConciseRL.
- North America > United States > Arizona (0.04)
- Europe > Romania > Centru Development Region > Sibiu County > Sibiu (0.04)
- Asia > Singapore (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
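As a rough illustration of how a correctness-gated conciseness reward might be composed (the function names, gating rule, and scale below are assumptions for illustration, not the paper's exact formula; the real conciseness score comes from an LLM judge):

```python
def conciseness_reward(answer_correct: bool, judge_score: float) -> float:
    """Hypothetical reward composition: an LLM judge rates the reasoning
    trace's conciseness in [0, 1], and that score only adds a bonus when
    the final answer is correct, so the policy is never rewarded for
    being briefly wrong."""
    if not answer_correct:
        return 0.0
    # Clamp the judge's score defensively before adding it as a bonus.
    return 1.0 + max(0.0, min(1.0, judge_score))
```

Gating on correctness (rather than summing independent terms) is one way to keep the reward from trading accuracy for brevity; the paper's actual composition may differ.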
Extension-ranking Semantics for Abstract Argumentation
Skiba, Kenneth, Rienstra, Tjitze, Thimm, Matthias, Heyninck, Jesse, Kern-Isberner, Gabriele
In this paper, we present a general framework for ranking sets of arguments in abstract argumentation based on their plausibility of acceptance. We present a generalisation of Dung's extension semantics as extension-ranking semantics, which induce a preorder over the power set of all arguments, allowing us to state that one set is "closer" to being acceptable than another. To evaluate the extension-ranking semantics, we introduce a number of principles that a well-behaved extension-ranking semantics should satisfy. We consider several simple base relations, each of which models a single central aspect of argumentative reasoning. The combination of these base relations provides us with a family of extension-ranking semantics. We also adapt a number of approaches from the literature for ranking extensions to be usable in the context of extension-ranking semantics, and evaluate their behaviour. Keywords: Abstract Argumentation, Ranking Sets of Objects, Extension-ranking Semantics
- Europe > Austria > Vienna (0.14)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.13)
- Africa > South Africa > Western Cape > Cape Town (0.04)
- (13 more...)
Scalability and Maintainability Challenges and Solutions in Machine Learning: Systematic Literature Review
Shivashankar, Karthik, Hajj, Ghadi S. Al, Martini, Antonio
This systematic literature review examines the critical challenges and solutions related to scalability and maintainability in Machine Learning (ML) systems. As ML applications become increasingly complex and widespread across industries, the need to balance system scalability with long-term maintainability has emerged as a significant concern. This review synthesizes current research and practices addressing these dual challenges across the entire ML life-cycle, from data engineering to model deployment in production. We analyzed 124 papers to identify and categorize 41 maintainability challenges and 13 scalability challenges, along with their corresponding solutions. Our findings reveal intricate interdependencies between scalability and maintainability, where improvements in one often impact the other. The review is structured around six primary research questions, examining maintainability and scalability challenges in data engineering, model engineering, and ML system development. We explore how these challenges manifest differently across various stages of the ML life-cycle. This comprehensive overview offers valuable insights for both researchers and practitioners in the field of ML systems. It aims to guide future research directions, inform best practices, and contribute to the development of more robust, efficient, and sustainable ML applications across various domains.
- Europe > Austria > Vienna (0.13)
- Europe > Norway > Eastern Norway > Oslo (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- (10 more...)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- (2 more...)
Tropical Bisectors and Carlini-Wagner Attacks
Grindstaff, Gillian, Lindberg, Julia, Schkoda, Daniela, Sorea, Miruna-Stefana, Yoshida, Ruriko
Pasque et al. showed that using a tropical symmetric metric as an activation function in the last layer can improve the robustness of convolutional neural networks (CNNs) against state-of-the-art attacks, including the Carlini-Wagner attack. This improvement occurs when the attacks are not specifically adapted to the non-differentiability of the tropical layer. Moreover, they showed that the decision boundary of a tropical CNN is defined by tropical bisectors. In this paper, we explore the combinatorics of tropical bisectors and analyze how the tropical embedding layer enhances robustness against Carlini-Wagner attacks. We prove an upper bound on the number of linear segments the decision boundary of a tropical CNN can have. We then propose a refined version of the Carlini-Wagner attack, specifically tailored for the tropical architecture. Computational experiments with MNIST and LeNet5 showcase our attack's improved success rate.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- (4 more...)
- Information Technology > Security & Privacy (0.46)
- Government > Military (0.46)
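The tropical symmetric metric that such a last layer is built on has a short closed form: d(x, y) = max_i(x_i - y_i) - min_i(x_i - y_i). A minimal sketch (the `tropical_layer` classification rule and prototype setup below are illustrative assumptions, not the exact architecture of Pasque et al.):

```python
def tropical_distance(x, y):
    """Tropical symmetric metric: d(x, y) = max_i(x_i - y_i) - min_i(x_i - y_i).
    Note it is invariant under adding a constant to every coordinate of x
    (or of y), since shifting all differences leaves max - min unchanged."""
    diffs = [a - b for a, b in zip(x, y)]
    return max(diffs) - min(diffs)

def tropical_layer(features, prototypes):
    """Sketch of a tropical last layer: score each class by the negated
    tropical distance to a learned per-class prototype. The boundary
    between two classes is then a tropical bisector of their prototypes."""
    return [-tropical_distance(features, p) for p in prototypes]
```

The piecewise-linear, non-differentiable `max`/`min` structure is what gradient-based attacks like Carlini-Wagner struggle with unless they are adapted to it.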
OpenHuEval: Evaluating Large Language Model on Hungarian Specifics
Yang, Haote, Wei, Xingjian, Wu, Jiang, Ligeti-Nagy, Noémi, Sun, Jiaxing, Wang, Yinfan, Yang, Zijian Győző, Gao, Junyuan, Wang, Jingchao, Jiang, Bowen, Wang, Shasha, Yu, Nanjun, Zhang, Zihao, Hong, Shixin, Liu, Hongwei, Li, Wei, Zhang, Songyang, Lin, Dahua, Wu, Lijun, Prószéky, Gábor, He, Conghui
We introduce OpenHuEval, the first benchmark for LLMs focusing on the Hungarian language and its specifics. OpenHuEval is constructed from a vast collection of Hungarian-specific materials sourced from multiple origins. In the construction, we incorporated the latest design principles for evaluating LLMs, such as using real user queries from the internet, emphasizing the assessment of LLMs' generative capabilities, and employing LLM-as-judge to enhance the multidimensionality and accuracy of evaluations. Ultimately, OpenHuEval encompasses eight Hungarian-specific dimensions, featuring five tasks and 3953 questions. Consequently, OpenHuEval provides a comprehensive, in-depth, and scientifically accurate assessment of LLM performance in the context of the Hungarian language and its specifics. We evaluated current mainstream LLMs, including both traditional LLMs and recently developed Large Reasoning Models (LRMs). The results demonstrate the significant need for evaluation and model optimization tailored to the Hungarian language and its specifics. We also established a framework for analyzing the thinking processes of LRMs with OpenHuEval, revealing intrinsic patterns and mechanisms of these models in non-English languages, with Hungarian serving as a representative example. We will release OpenHuEval at https://github.com/opendatalab/OpenHuEval.
- North America > United States (0.14)
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > Ukraine (0.04)
- (8 more...)
- Government > Regional Government (0.46)
- Education > Educational Setting (0.46)
Change Is the Only Constant: Dynamic LLM Slicing based on Layer Redundancy
Dumitru, Razvan-Gabriel, Clotan, Paul-Ioan, Yadav, Vikas, Peteleaza, Darius, Surdeanu, Mihai
This paper introduces a novel model compression approach through dynamic layer-specific pruning in Large Language Models (LLMs), enhancing the traditional methodology established by SliceGPT. By transitioning from constant to dynamic slicing, our method leverages the newly proposed Layer Redundancy (LR) score, which assesses how much each layer changes its input by measuring the cosine similarity of the input to the output of the layer. We use this score to prune parts of individual layers based on redundancy in such a way that the average pruned percentage for all layers is a fixed value. We conducted extensive experiments using models like Llama3-8B and Mistral-7B on multiple datasets, evaluating different slicing bases and percentages to determine optimal configurations that balance efficiency and performance. Our findings show that our dynamic slicing approach not only maintains but, in many cases, enhances model performance compared to the baseline established by constant slicing methods. For instance, in several settings, we see performance improvements of up to 5% over the SliceGPT baseline. Additionally, a perplexity decrease by as much as 7% was observed across multiple benchmarks, validating the effectiveness of our method. The code, model weights, and datasets are open-sourced at https://github.com/RazvanDu/DynamicSlicing.
- North America > United States > Arizona (0.04)
- Europe > Romania > Centru Development Region > Sibiu County > Sibiu (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
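The LR score itself is just the cosine similarity between a layer's input and output. A minimal sketch (the proportional budget-allocation rule in `dynamic_slice_budget` is an illustrative assumption, not the paper's exact formula; only the "average pruned percentage is fixed" property is taken from the abstract):

```python
from math import sqrt

def layer_redundancy(layer_input, layer_output):
    """LR score: cosine similarity between a layer's input and output
    vectors. A value near 1 means the layer barely changes its input,
    i.e. the layer is highly redundant and a better pruning candidate."""
    dot = sum(a * b for a, b in zip(layer_input, layer_output))
    norm = sqrt(sum(a * a for a in layer_input)) * sqrt(sum(b * b for b in layer_output))
    return dot / norm

def dynamic_slice_budget(lr_scores, avg_prune=0.25):
    """Hypothetical allocation: give each layer a pruning fraction
    proportional to its LR score, rescaled so the mean pruned fraction
    across layers equals `avg_prune` (clipped to [0, 1])."""
    total = sum(lr_scores)
    budgets = [s / total * avg_prune * len(lr_scores) for s in lr_scores]
    return [min(max(b, 0.0), 1.0) for b in budgets]
```

Identical input and output give an LR score of exactly 1; equal scores across layers reduce the allocation to constant slicing at `avg_prune`.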
Social Mediation through Robots -- A Scoping Review on Improving Group Interactions through Directed Robot Action using an Extended Group Process Model
Weisswange, Thomas H., Javed, Hifza, Dietrich, Manuel, Jung, Malte F., Jamali, Nawid
Group processes refer to the dynamics that occur within a group and are critical for understanding how groups function. With robots being increasingly placed within small groups, improving these processes has emerged as an important application of social robotics. Social Mediation Robots elicit behavioral change within groups by deliberately influencing the processes of groups. While research in this field has demonstrated that robots can effectively affect interpersonal dynamics, there is a notable gap in integrating these insights to develop coherent understanding and theory. We present a scoping review of literature targeting changes in social interactions between multiple humans through intentional action from robotic agents. To guide our review, we adapt the classical Input-Process-Output (I-P-O) models that we call "Mediation I-P-O model". We evaluated 1633 publications, which yielded 89 distinct social mediation concepts. We construct 11 mediation approaches robots can use to shape processes in small groups and teams. This work strives to produce generalizable insights and evaluate the extent to which the potential of social mediation through robots has been realized thus far. We hope that the proposed framework encourages a holistic approach to the study of social mediation and provides a foundation to standardize future reporting in the domain.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.27)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.27)
- Europe > United Kingdom > England > Greater London > London (0.14)
- (77 more...)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Instructional Material (1.00)
- Research Report > Experimental Study (0.93)
- Law > Alternative Dispute Resolution (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
"Vorbești Românește?" A Recipe to Train Powerful Romanian LLMs with English Instructions
Masala, Mihai, Ilie-Ablachim, Denis C., Dima, Alexandru, Corlatescu, Dragos, Zavelca, Miruna, Olaru, Ovio, Terian, Simina, Terian, Andrei, Leordeanu, Marius, Velicu, Horia, Popescu, Marius, Dascalu, Mihai, Rebedea, Traian
In recent years, Large Language Models (LLMs) have achieved almost human-like performance on various tasks. While some LLMs have been trained on multilingual data, most of the training data is in English; hence, their performance in English greatly exceeds other languages. To our knowledge, we are the first to collect and translate a large collection of texts, instructions, and benchmarks and train, evaluate, and release open-source LLMs tailored for Romanian. We evaluate our methods on four different categories, including academic benchmarks, MT-Bench (manually translated), and a professionally built historical, cultural, and social benchmark adapted to Romanian. We argue for the usefulness and high performance of RoLLMs by obtaining state-of-the-art results across the board. We publicly release all resources (i.e., data, training and evaluation code, models) to support and encourage research on Romanian LLMs while concurrently creating a generalizable recipe, adequate for other low or less-resourced languages.
- Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- (4 more...)
A Farewell to Harms: Risk Management for Medical Devices via the Riskman Ontology & Shapes
Gorczyca, Piotr, Arndt, Dörthe, Diller, Martin, Kettmann, Pascal, Mennicke, Stephan, Strass, Hannes
We introduce the Riskman ontology & shapes for representing and analysing information about risk management for medical devices. Risk management is concerned with taking necessary precautions so a medical device does not cause harm to users or the environment. To date, risk management documentation is submitted to notified bodies (for certification) in the form of semi-structured natural language text. We propose to use classes from the Riskman ontology to logically model risk management documentation, and to use the included SHACL constraints to check for syntactic completeness and conformity to relevant standards. In particular, the ontology is modelled after ISO 14971 and the recently published VDE Spec 90025. Our proposed methodology has the potential to save many person-hours for both manufacturers (when creating risk management documentation) and notified bodies (when assessing submitted applications for certification), and thus offers considerable benefits for healthcare and, by extension, society as a whole.
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (24 more...)
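The real completeness check is expressed as SHACL shapes over the ontology; as a rough Python stand-in (the field names below are illustrative placeholders, not the ontology's actual classes or ISO 14971 terms), the idea is simply to report which mandatory links a risk-analysis record is missing:

```python
# Illustrative stand-in for a SHACL completeness check: every risk record
# must link a hazard to a hazardous situation, a harm, and a risk control.
# Field names are hypothetical, not the Riskman ontology's vocabulary.
REQUIRED_FIELDS = {"hazard", "hazardous_situation", "harm", "risk_control"}

def check_risk_record(record: dict) -> list:
    """Return the sorted list of mandatory fields missing from a record;
    an empty list means the record is syntactically complete."""
    return sorted(REQUIRED_FIELDS - record.keys())
```

In the actual methodology this validation runs over RDF data with a SHACL engine, so violations come back as structured validation reports rather than field names.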
Enhancing Transformer RNNs with Multiple Temporal Perspectives
Dumitru, Razvan-Gabriel, Peteleaza, Darius, Surdeanu, Mihai
We introduce the concept of multiple temporal perspectives, a novel approach applicable to Recurrent Neural Network (RNN) architectures for enhancing their understanding of sequential data. This method involves maintaining diverse temporal views of previously encountered text, significantly enriching the language models' capacity to interpret context. To show the efficacy of this approach, we incorporate it into the Receptance Weighted Key Value (RWKV) architecture, addressing its inherent challenge of retaining all historical information within a single hidden state. Notably, this improvement is achieved with a minimal increase in the number of parameters --even as little as $0.04\%$ of the original number of parameters. Further, the additional parameters necessary for the multiple temporal perspectives are fine-tuned with minimal computational overhead, avoiding the need for a full pre-training. The resulting model maintains linear computational complexity during prompt inference, ensuring consistent efficiency across various sequence lengths. The empirical results and ablation studies included in our research validate the effectiveness of our approach, showcasing improved performance across multiple benchmarks. The code, model weights and datasets are open-sourced at: https://github.com/RazvanDu/TemporalRNNs.
- North America > United States > Arizona > Pima County > Tucson (0.14)
- Europe > Romania > Centru Development Region > Sibiu County > Sibiu (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
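One way to picture "multiple temporal perspectives" over RWKV's single hidden state is several exponentially decaying summaries of the stream, each with its own decay rate. This is a hypothetical scalar sketch of that intuition (the update and mixing rules below are assumptions for illustration, not the RWKV-based architecture from the paper):

```python
def update_perspectives(states, x, decays):
    """Hypothetical sketch: maintain several EMA summaries of the input
    stream instead of one hidden state. Small decays track recent
    context; decays near 1 retain long-range information."""
    return [d * s + (1.0 - d) * x for s, d in zip(states, decays)]

def combine(states, weights):
    """Mix the perspectives into one context value (learned weights in a
    real model; fixed here for illustration)."""
    return sum(w * s for w, s in zip(states, weights))
```

Because each update is a constant-time recurrence, adding a handful of such views preserves the linear inference complexity the abstract emphasizes, at the cost of only a few extra parameters.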